Abstract — This work presents an automatic algorithm to
measure fibrosis in muscle sections of mdx mice, a mutant
strain used as a model of Duchenne dystrophy. The
algorithm automatically segments three different
tissues: muscle cell tissue (MT), pure collagen fiber
deposit (CD) and cellular infiltrates surrounded by loose
collagen deposit (CI), by using a statistical classifier based on
the k-Nearest Neighbour (k-NN) decision rule in the RGB
color space. The algorithm is trained by selecting a number
of correctly classified pixels from each class. The k-NN
rule assigns every other pixel to the class that is most
represented among the k nearest training samples in the RGB
space; it is efficiently implemented with a fast k-distance
transform algorithm. All extracted areas are quantified in
absolute (µm²) and relative (%) values. To validate
this method, the different tissues were manually segmented
and their quantifications statistically compared with those
obtained automatically. Statistical analysis showed
interoperator variability in manual segmentation. Automatic
quantifications of the same areas did not differ significantly
from their mean manual evaluations. In conclusion, this
method produces fast, reliable and reproducible results.

Keywords — Automatic morphometry, fibrosis morphometry,
muscle measurements, k-NN.

I.  Introduction 

Fibrosis is a common consequence of a large number of
pathologies in different organs, and its quantification is
a fundamental step towards their correct evaluation. This
kind of measurement has been addressed by different
methods. Semiquantitative stereological approaches have
been used in histomorphometrical problems whose aim was
to quantify interstitial fibrosis by using the point-counting
technique [1], [2], [3], [4]. However, given that in muscular
dystrophies the fibrotic areas are not homogeneously
distributed [5], [6], [7], [8], [9], the amount of work needed
for an operator to obtain representative measurements of
each of the different pathologic events becomes a considerable
burden. In addition, manual measurements are highly
time consuming, not very precise and dependent on the
skill of the experimenter, resulting in large inter- and
intra-observer variabilities. In contrast, digital image
analysis provides more reliable and reproducible quantitative
measurements than conventional methods [10].

The present investigation describes the development and
validation of a new automatic method to quantify
interstitial fibrosis in muscle images. First, a few samples
of correctly classified pixels are manually selected to obtain
a good representation of the different tissues (categories)
in the image to classify. Next, these selected pixels are used
to train a statistical classifier in the RGB color space. The
classifier used here is based on the k-Nearest Neighbors
(k-NN) classification rule [11], [12], which is a technique for
non-parametric pattern classification. This rule assigns
a given pixel to the category most heavily represented
amongst its k nearest neighbors in the pattern space. Every
color in the RGB space is thus associated with a given
category, according to the k-NN rule. Finally, the
complete data set with the images of the whole muscle
section is automatically classified by using an efficient
implementation of this k-NN rule.

II.  Materials  and  Methods 

A.  Histological  Material 

Male mdx mice used in an unrelated experiment were
anaesthetised by a subcutaneous injection of a mixture of
ketamine hydrochloride 50 mg/ml (Ketalar) and xylazine
hydrochloride 2% (Rompun). The tibialis anterior muscle was
then dissected and completely sectioned. Serial 7 µm-thick
transverse sections were made from the mid-belly region
of the muscles, using a cryostat (Reichert-Jung Cryocut
E), and mounted on microscope slides (Superfrost). The
sections were then stained with 1% picro-Sirius red (Direct red
80, Fluka Sigma), washed with two changes of acetic acid
water, dehydrated and mounted with a resinous medium.
Sirius red stains CD in a deep red color and MT in yellow,
while CI areas are stained in a diffuse pink color (see Figure
1).

B.  Image  Acquisition 

Images were acquired with a Fujix HC-2000 high-resolution
digital camera (1280 x 1024 to 2560 x 2048 pixels) coupled
to an Olympus BX 50 microscope with Olympus
U-BMAD and U-TU1 X adapters (Olympus Optical Company,
Ltd., Japan). The microscope stage was remotely controlled
with a PRIOR H-101 step-by-step controller, which
communicated with a Dell PC (Pentium III, 300 MHz,
384 MB RAM) through an RS232 line. Analog images
were then digitized at 24 bits in different formats (TIFF,
JPEG and BMP) by using Photograph II SH-WSII software.
The automatic microscope stage and the Photograph
II software were controlled by in-house software written
in Winbatch 2000 (WindowWare Inc) under Windows
NT4.

Use of an intermediate lens and a x20 objective
yielded a total magnification of x900. Screen image size
was 199 x 152 µm² for a 1280 x 1000 pixel image, resulting
in a total resolution of 0.011 µm²/pixel.

C.  Image  Analysis 

A transverse section of muscle was digitized and one
image was randomly selected to perform manual-automatic
comparisons. The segmented surfaces of MT, CD and CI
were quantified in both raw pixels and µm².

C.1 The Bayes risk and k-NN decision approach

The basic structure of a general decision problem
consists of a set of decisions $D$, a parameter space $\Omega$, a
probability distribution function $p(\Omega)$ and a utility function
$u(\delta(\omega))$, whose maximum is the best decision $\delta^*(\omega)$.
Decision rules are evaluated in terms of their average loss
function $l(\delta(x), \omega)$ with respect to the data which might arise,
which is given by

$$l(\delta, \omega) = \sup_{\delta_i \in D} \{ u(\delta_i(\omega)) - u(\delta(\omega)) \} \qquad (1)$$

where $x$ is the observation and $\omega$ a particular event. The
risk function $r(\delta, \omega)$ of a decision rule $\delta$ can then be defined
as

$$r(\delta, \omega) = \int l(\delta(x), \omega) \, p(x \mid \omega) \, dx \qquad (2)$$

Classical decision theory focuses on the decision rule
which minimises the expected risk (the so-called Bayes risk).

Classification problems can always be placed between
two situations: either there is complete knowledge
of the statistical properties of the problem, or the only
statistical knowledge is that which can be inferred from
some samples. In the former case, a standard Bayes analysis
will yield an optimal choice with a corresponding minimum
probability of classification error: the Bayes risk
$R^*$. In the latter case, the decision to assign the observation
$x$ to a given category $\theta$ depends on the correct
classification of $n$ samples $(x_1, \theta_1), (x_2, \theta_2), \ldots, (x_n, \theta_n)$
and belongs to the domain of nonparametric statistics.

Given the knowledge of $N$ prototype patterns (vectors
of dimension $S$) and their correct classification into $M$
classes, the k-NN rule assigns an unclassified pattern to the
class that is most heavily represented among its $k$ nearest
neighbors in the pattern space (under some appropriate metric). The
first formulation of this rule was given by Fix and Hodges
[12] in 1951, who established the consistency of the rule for
sequences such that $k \to \infty$ and $k/N \to 0$. Of course,
the probability of error $R$ of this rule must be at least as
large as the Bayes probability of error $R^*$, which could be
achieved with perfect knowledge of the probability density
functions of each class. Cover and Hart [11] showed
that the conditional risk for the 1-NN rule is bounded by
$R \le R^* (2 - \frac{M}{M-1} R^*)$, where $M$ is the number of different
classes. This is close to $R \le 2R^*$ when $R^*$ is small, as
usual in practical applications. For the k-NN rule, the risk
is bounded by $(1 + 1/k) R^*$ [11].
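The decision rule described above can be illustrated with a minimal brute-force sketch in Python. It is not the implementation used in this work: the function name `knn_classify`, the squared Euclidean metric and the prototype colors are assumptions chosen only for illustration.

```python
from collections import Counter


def knn_classify(pixel, prototypes, k):
    """Return the class most represented among the k prototypes nearest to `pixel`.

    `pixel` is an (R, G, B) tuple; `prototypes` is a list of ((R, G, B), label)
    pairs. Squared Euclidean distance is used, which preserves the ordering.
    """
    def squared_distance(proto):
        return sum((a - b) ** 2 for a, b in zip(pixel, proto[0]))

    nearest = sorted(prototypes, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    # On equal counts, the label met first among the sorted neighbours wins.
    return votes.most_common(1)[0][0]


# Hypothetical prototype colours, invented for this example only.
prototypes = [
    ((200, 30, 40), "CD"), ((190, 20, 50), "CD"), ((210, 40, 35), "CD"),
    ((230, 210, 60), "MT"), ((220, 200, 70), "MT"), ((240, 215, 50), "MT"),
    ((235, 150, 170), "CI"), ((225, 140, 160), "CI"), ((240, 160, 180), "CI"),
]
label = knn_classify((205, 35, 45), prototypes, k=3)
```

Sorting all prototypes for every pixel is what makes this direct form expensive on full-size images.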


C.2 k-NN classification in RGB color space

Implementing the k-NN rule with a brute-force method
to classify F patterns with N prototypes requires F x N
distance computations and O(F x N x log(N)) comparisons,
which is prohibitive in our case. Different approaches have
been devised to improve the performance of the algorithm. Some
authors [13], [14], [15], [16] reduce the number of prototypes
to consider while trying not to affect the accuracy of the
resulting classification. Jiang and Zhang [17] hierarchically
decomposed the prototypes into disjoint subsets and then
applied a powerful tree-search algorithm. Friedman [18]
orders the training data along the axis with the maximum
sparsity for each pattern. This restricts the computations
to a band around the projection of the test data onto this
axis. Finally, an approach that is better suited to our
problem was first proposed by Warfield [19], who applied it
to the classification of double-echo spin-echo MR images. In such
an application, the number of possible patterns is smaller
than the number of patterns to classify, so that it becomes
efficient to pre-compute a lookup table for every possible
pattern, then to classify patterns by accessing this lookup
table.
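The lookup-table idea can be sketched as follows. This is a simplified illustration under stated assumptions: the table is filled by brute force rather than by a k-distance transform, colors are quantized by dropping low-order bits, and all names (`build_lut`, `classify_image`) and prototype values are hypothetical.

```python
from collections import Counter
from itertools import product


def knn_label(color, prototypes, k):
    """Majority label among the k prototypes nearest to `color`."""
    nearest = sorted(
        prototypes,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p[0])),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def build_lut(prototypes, k, bits):
    """Precompute the k-NN label of every quantized RGB colour (2**(3*bits) cells)."""
    shift = 8 - bits  # assumes 8-bit input channels
    quantized = [(tuple(c >> shift for c in rgb), label) for rgb, label in prototypes]
    levels = range(1 << bits)
    return {cell: knn_label(cell, quantized, k) for cell in product(levels, repeat=3)}


def classify_image(pixels, lut, bits):
    """Classify each (R, G, B) pixel with a single table access."""
    shift = 8 - bits
    return [lut[(r >> shift, g >> shift, b >> shift)] for r, g, b in pixels]


# Hypothetical prototype colours, invented for this example only.
prototypes = [
    ((200, 30, 40), "CD"), ((190, 20, 50), "CD"), ((210, 40, 35), "CD"),
    ((230, 210, 60), "MT"), ((220, 200, 70), "MT"), ((240, 215, 50), "MT"),
    ((235, 150, 170), "CI"), ((225, 140, 160), "CI"), ((240, 160, 180), "CI"),
]
lut = build_lut(prototypes, k=3, bits=3)
labels = classify_image([(205, 35, 45), (228, 205, 65)], lut, bits=3)
```

Once the table exists, the cost per pixel no longer depends on the number of prototypes, which is the property that makes the approach attractive when the image contains far more pixels than there are distinct quantized colors.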

The computation of this lookup table is essentially a
k-distance transformation (k-DT) problem. Distance transformations
are algorithms that compute, for every pixel of an image, the
distance to the nearest pixel of a given object. The k-DT
algorithm used here is described in detail in Cuisenaire [20].
Briefly, a k-Euclidean distance transformation is computed
by using ordered propagation to scan the pattern space,
starting from the prototype patterns, then moving to their
neighbors, then to their neighbors' neighbors and so on, by order
of increasing Euclidean distance. Strict respect of the
increasing distance order is achieved by bucket sorting of
the propagation queue, which ensures that no unnecessary
computations are performed.

C.3 Choice of parameters

The parameters of this method are the number N of
training samples required, the number k of nearest training
samples used by the k-NN rule and the size $2^{DB}$ of the
pattern space, where D is the number of dimensions, i.e., 3
for the RGB space, and B the number of bits used to quantize
each color. Empirically, we found that a few hundred
training samples per class were sufficient to yield good
results. To determine the optimal k and B, we performed
10-fold cross-validation by randomly dividing our training
data into 10 subsets, then using 9 of those to train the
classifier and the remaining one to evaluate the resulting error
rate R.
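The cross-validation procedure can be sketched as below for the parameter k. The samples are synthetic and deliberately well separated, not data from this study, and the helper names are assumptions; scanning B would follow the same pattern after quantizing the colors.

```python
import random
from collections import Counter


def knn_label(color, prototypes, k):
    """Majority label among the k prototypes nearest to `color`."""
    nearest = sorted(
        prototypes,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p[0])),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validated_error(samples, k, folds=10, seed=0):
    """Estimate the k-NN error rate R by `folds`-fold cross-validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    subsets = [shuffled[i::folds] for i in range(folds)]
    errors = 0
    for i, held_out in enumerate(subsets):
        # Train on the other subsets, evaluate on the held-out one.
        training = [s for j, subset in enumerate(subsets) if j != i for s in subset]
        errors += sum(knn_label(rgb, training, k) != label for rgb, label in held_out)
    return errors / len(shuffled)


# Synthetic, well-separated samples (NOT data from the study).
rng = random.Random(1)
centres = {"CD": (200, 30, 40), "MT": (230, 210, 60), "CI": (235, 150, 170)}
samples = [
    (tuple(c + rng.randint(-15, 15) for c in centre), label)
    for label, centre in centres.items()
    for _ in range(30)
]
error_rates = {k: cross_validated_error(samples, k) for k in (1, 5, 11)}
```

With real training pixels the classes overlap, so the error rate would vary with k and the smallest value over the scanned range would guide the choice.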

D.  Segmentation  of  interstitial  fibrosis 

Manual and automatic estimates were performed
on a randomly selected image, as follows. Using
Scion Image Beta 4.0.2 Win software (available at
http://www.scioncorp.com), the different surfaces were
manually outlined and the corresponding surfaces in pixels
calculated. The automatic algorithm was implemented with the VTK
software, using a TCL script as interpreter. Two different
expert operators performed the test in five replications,
which were compared with five automatic replications. The
data were analyzed with SigmaStat software version 2.0.3
(SPSS). The intraoperator reproducibility was determined
by calculating the coefficient of variation (CV = SD/m).
Intra- and interoperator variabilities in the manual segmentation
of MT, CD and CI were assessed with a two-way
ANOVA test, using replication and operators as factors,
followed by Tukey's test as a post hoc test. In all tests,
differences were considered significant when p < 0.05.
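As a small worked example of the coefficient of variation CV = SD/m, the sketch below uses invented replicate areas, not measurements from this study. The use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = SD / m, returned as a fraction (multiply by 100 for a percentage)."""
    return stdev(values) / mean(values)  # stdev is the sample standard deviation


# Five hypothetical replicate areas in pixels, invented for illustration.
replicates = [1000, 1020, 980, 1010, 990]
cv = coefficient_of_variation(replicates)
cv_percent = round(100 * cv, 1)
```

For these illustrative values the mean is 1000 and the sample SD is about 15.8, so the CV is roughly 1.6%.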

III.  Results 

A subsection of a typical muscle image is illustrated in
the upper part of Figure 1. In this case, the image subset
(59.4 x 63.4 µm²) shows the MT surface colored in yellow,
the CD in red and the CI reaction in pink. Note the high
tissue deformation introduced by fibrosis.


Fig. 1. Top figure shows an image subset (59.4 x 63.4 µm²) containing
the different features to segment. Bottom image corresponds
to the segmented subset image. Four different colors mark
the classes defined on this image. Pixels colored in white correspond
to the CD, stained in red in the upper image. Black pixels correspond
to the CI, colored in pink.

Four different classes were defined on this image: MT,
CD, CI and artifact. The artifact class corresponds to staining
failures, which usually appear as black spots. Segmentation,
identification and classification of these features
were achieved by using 162 (artifact), 297 (MT), 311 (CD)
and 318 (CI) correctly classified pixels. The bottom part
displays the resulting segmented image for k = 11.
Pixels are colored in white, black and two different gray
levels for the CD, CI, MT and artifact classes, respectively.


A.  Parameter  selection 


Fig.  2.  Error  percentage  plotted  against  k  for  different  quantization 
steps. 

The error rates R for 1 ≤ k ≤ 13 and 3 ≤ B ≤ 7 are
shown in Figure 2. Satisfactory results are obtained as long
as k ≥ 5 and B ≥ 6. For k = 5 and B = 6, the size of the
k-DT space is 5 x 8^6 = 1310720, and the LUT takes a couple of
seconds to compute. CPU time increases linearly
with k and is multiplied by 8 for each extra bit used to
quantize the colors. In practice, we routinely use k = 11
and B = 6 or 7.
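The size quoted above follows from k entries for each of the 2^(D·B) cells of the quantized pattern space, with D = 3 for RGB. A short check of this arithmetic, with a hypothetical helper name, could read:

```python
def kdt_space_size(k, bits, dims=3):
    """Number of entries in the k-DT space: k labels for each of 2**(dims*bits) cells."""
    return k * 2 ** (dims * bits)


size_k5_b6 = kdt_space_size(5, 6)   # 5 x 8^6
size_k5_b7 = kdt_space_size(5, 7)   # one extra bit per colour channel
growth_per_bit = size_k5_b7 // size_k5_b6
```

Each extra bit doubles the number of levels on all three color axes, which is where the factor of 2^3 = 8 comes from, while the dependence on k is linear.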

B. Manual-Automatic comparisons

Manual measurements performed on the same image in
five replications were statistically tested. Results are shown
in Table I.

TABLE I

Manual-Manual comparisons were performed using the same
image. Conventions: Inter: Interoperator differences;
Intra: Intraoperator differences. Significant when p < 0.05,
marked with asterisk.

Tissue    Inter         Intra
MT        p = 0.266     p = 0.072
CD        p = 0.042*    p = 0.849
CI        p = 0.088     p = 0.587

The data show significant interoperator differences for
the CD tissue. All other comparisons did not show
significant differences.

TABLE II

Manual-Automatic comparisons were performed using the
same image. Same conventions as in Table I.

Tissue    Inter         Intra
MT        p = 0.561     p = 0.895
CD        p = 0.183     p = 0.687
CI        p = 0.706     p = 0.601

Table II shows the statistical tests performed on the set of
measurements obtained by the same operator (both automatic
and manual). The data did not differ significantly.

TABLE III

Coefficient of variation for the two expert operators and
the automatic method during five replications, in
percentage. Conventions: Op1: Operator 1; Op2: Operator
2; Auto: Automatic method.

Tissue    Op1    Op2    Auto
MCT       1%     1%     1%
PCBF      11%    16%    11%
CISLC     22%    6%     7%

Finally, Table III shows the CV for the two operators and
the automatic method. The data show a larger dispersion for
the two manual measurements.

IV. Discussion

Compared with conventional methods, automatic
approaches supply additional information and provide
faster, more precise and more reproducible quantifications. In
the present study, we propose a novel automatic image analysis
algorithm to segment and classify areas of interstitial
fibrosis. The algorithm quickly yields accurate
segmentations. Lookup table creation takes about
12 seconds/image on average and image quantification about
4 seconds/image, depending on the number of selected prototype
pixels and on the value of k, i.e. the number
of neighbors on which the decision is based. Also, as shown
before, manual measurements showed significant interoperator
differences for CD, while the intraoperator
value for MT was close to the statistical significance
threshold. These results show an important interoperator
variability, which is much smaller with the automatic
method. Reproducibility is also much better with the
automatic method. There are two main factors that
influence the reproducibility. First, the probability
distributions of the different colors in the image can overlap each
other. This corresponds to the Bayes risk and can be
diminished by increasing the color contrast of the image, which implies
being extremely careful at every step leading to well-stained
sections. Second, noise can be introduced by the
operator when selecting the initial correctly classified pixels,
which is directly related to the image quality. On the other
hand, with this method, the microscope acquisition
settings become less important, since the algorithm classifies
the different colors based on the correctly classified
patterns that the observer supplies. Of course, any effort
to increase the difference between colors can improve the
algorithm performance, but a distinction visible to the naked
eye is sufficient to yield reproducible results, as shown
in the previous section.


ferent  tissues  present  in  muscle  sections  stained  with  Sirius 
red. An objective and accurate quantification of the interstitial
fibrosis reaction is fundamental to detect early structural
modifications in pathologies such as Duchenne
dystrophy. This reasoning can easily be extended to all
pathologies in which fibrosis plays a determinant role and
in which morphometry of these changes is usually difficult
due to the large irregularities in color and shape introduced
by this kind of reaction, as can be seen in Figure 1.